41 research outputs found

    Modelling the influence of masker bandwidth on BMLDs

    Experimentele bepaling van de permeabiliteit van rattenhuid (Experimental determination of the permeability of rat skin)

    Parametric coding of stereo audio

    Parametric-stereo coding is a technique to efficiently code a stereo audio signal as a monaural signal plus a small amount of parametric overhead to describe the stereo image. The stereo properties are analyzed, encoded, and reinstated in a decoder according to spatial psychoacoustical principles. The monaural signal can be encoded using any (conventional) audio coder. Experiments show that the parameterized description of spatial properties enables a highly efficient, high-quality stereo audio representation.
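
As a rough illustration of the "monaural signal plus parameters" idea, the sketch below reduces one stereo frame to a mono downmix and two stereo-image parameters. The abstract does not name the parameters, so the level difference and normalized correlation used here, the averaging downmix, and the function name `analyze_stereo` are all assumptions made for illustration, not the coder described above.

```python
import math

def analyze_stereo(left, right):
    """Hypothetical analysis step: return (mono, level_diff_db, correlation)."""
    # Monaural downmix (assumed here to be a simple average of both channels).
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    e_left = sum(l * l for l in left)
    e_right = sum(r * r for r in right)
    cross = sum(l * r for l, r in zip(left, right))
    eps = 1e-12  # guards against division by zero for silent frames
    # Inter-channel level difference in dB.
    level_diff_db = 10.0 * math.log10((e_left + eps) / (e_right + eps))
    # Normalized inter-channel correlation at lag zero.
    correlation = cross / math.sqrt((e_left + eps) * (e_right + eps))
    return mono, level_diff_db, correlation

# Usage: the right channel is the left channel at half amplitude,
# so the level difference is about 6.02 dB and the correlation about 1.
frame = [math.sin(2 * math.pi * k / 32) for k in range(64)]
mono, ild_db, icc = analyze_stereo(frame, [0.5 * x for x in frame])
print(round(ild_db, 2), round(icc, 3))
```

A real coder would compute such parameters per frequency band and per time frame; the single-band version keeps the sketch short.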

    Modeling binaural signal detection

    With the advent of multimedia technology and powerful signal processing systems, audio processing and reproduction have gained renewed interest. Examples of products that have been developed are audio coding algorithms to efficiently store and transmit music and speech, or audio reproduction systems that create virtual sound sources. Usually, these systems have to meet the high audio quality of e.g. the compact-disc standard. Engineers have become aware of the fact that signal-to-noise ratios and distortion measures do not tell the whole story when it comes to sound quality. As a consequence, new algorithms have to be evaluated by extensive listening tests. Drawbacks of this method of evaluation are that these tests are expensive and time consuming. Moreover, listening tests usually do not give any insight into why a specific algorithm does or does not work. Hence there is a demand for objective and fast evaluation tools for new audio technologies. One way to meet these demands is to develop a model of the auditory system that can predict the perceived distortion and which can indicate the nature of these distortions. This thesis describes and validates a model for the binaural hearing system. In particular, it aims at predicting the audibility of changes in arbitrary binaural stimuli. Two important properties for binaural hearing are interaural intensity differences (IIDs) and interaural time differences (ITDs) present in the waveforms arriving at both ears. These interaural differences enable us to estimate the position of a sound source but also contribute to our ability to detect signals in noisy environments. Hence one of the most important objectives for a comprehensive model is its ability to describe the sensitivity for interaural differences in a large variety of conditions. The basis of the model relies on psychoacoustic experiments that were performed with human listeners.
In one series of experiments, subjects had to detect the presence of interaural cues for various statistical distributions of the IIDs and ITDs. The results revealed that the energy of the difference of the signals arriving at both ears following a peripheral filtering stage can successfully describe the sensitivity for interaural time and intensity differences. This approach is very similar to Durlach’s EC theory. Furthermore, other listening experiments with varying degrees of stimulus uncertainty revealed that the detection process of the binaural auditory system may well be simulated as a template-matching procedure. The idea of template matching based on the energy of the difference signal was incorporated in a time-domain detection model. This model transforms arbitrary stimuli into an internal representation. This representation comprises four dimensions: time, frequency channel, internal interaural delay and internal interaural level adjustment. The internal model activity as a function of these dimensions entails both binaural and monaural properties of the presented stimuli. The accuracy of these properties is limited by the addition of internal noise and by the limited frequency and time resolution incorporated in various stages of the model. An important model feature is the ’optimal detector’. This optimal detector analyzes the internal representation of the presented waveforms and extracts information from it, for example the presence or absence of a signal added to a masker. This process entails a strategy that optimally reduces the internal noise by integrating information across time and frequency channels. The model was tested for its ability to predict thresholds as a function of spectral and temporal stimulus parameters. During all simulations, all model parameters were kept constant. The results revealed that the model can account for a large variety of experimental data that are described in the literature. 
The most prominent finding was that the model can quantitatively account for the wider effective critical bandwidth observed in band-widening NoSπ experiments. This wider effective bandwidth is found if the threshold of audibility is measured for interaurally out-of-phase signals (Sπ) added to band-limited interaurally in-phase noise (No). In our model, this phenomenon results from the fact that the cue for detection is available in a range of filters if the masker bandwidth is sufficiently small. The increased effective bandwidth therefore does not reflect a worse binaural spectral resolution compared to the monaural spectral resolution; rather, it follows from the ability to integrate information across frequency. It was also shown that the optimal detector can account for effects found by manipulating temporal stimulus properties. To be more precise, the model can account for the phenomenon that the temporal resolution of the binaural auditory system obtained from stimuli with time-varying interaural correlations seems to be worse for sinusoidally-varying cross correlation than for rectangular correlation modulations. To extend the model’s predictive scope towards more natural listening conditions, experiments were performed with virtual sound sources, which were generated by using head-related transfer functions (HRTFs). The complexity of these impulse responses was gradually decreased by a spectral smoothing operation. During listening tests, subjects had to rate the audibility of this operation. The results revealed that the fine structure of HRTF phase and magnitude spectra is relatively unimportant for the generation of virtual sound sources in the horizontal plane. The same experiment was subsequently simulated with the model.
Comparisons between subject data and model predictions showed that the model could not only predict whether the HRTF smoothing was audible or not, but that it could also predict the amount of perceptual degradation for supra-threshold HRTF smoothing.
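
The detection cue named in this abstract, the energy of the difference between the two ear signals, can be illustrated with a minimal sketch. It assumes an idealized case with no peripheral filtering, no internal noise, and no internal delay or level adjustment, so it is not the thesis model itself; the helper names `make_stimulus` and `difference_energy`, the sample rate, and the signal level are hypothetical.

```python
import math
import random

def make_stimulus(masker, signal, antiphasic):
    """Diotic masker (No) plus a signal that is in phase (So) or out of phase (Spi)."""
    sign = -1.0 if antiphasic else 1.0
    left = [m + s for m, s in zip(masker, signal)]
    right = [m + sign * s for m, s in zip(masker, signal)]
    return left, right

def difference_energy(left, right):
    """Energy of the difference between the left-ear and right-ear waveforms."""
    return sum((l - r) ** 2 for l, r in zip(left, right))

rng = random.Random(0)
fs = 8000    # assumed sample rate in Hz
n = 1000     # number of samples
masker = [rng.gauss(0.0, 1.0) for _ in range(n)]
signal = [0.1 * math.sin(2 * math.pi * 500 * k / fs) for k in range(n)]

e_noso = difference_energy(*make_stimulus(masker, signal, antiphasic=False))
e_nospi = difference_energy(*make_stimulus(masker, signal, antiphasic=True))

# In NoSo the ear signals are identical, so the difference carries no energy.
# In NoSpi the diotic masker cancels and roughly 4x the signal energy remains.
print(e_noso, e_nospi)
```

In this idealized setting the masker cancels completely; in the model described above, internal noise and limited resolution bound how well that cancellation works.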

    A new binaural detection model based on contralateral inhibition

    Binaural models attempt to explain binaural phenomena in terms of neural mechanisms that extract binaural information from acoustic stimuli. In this paper, a model setup is presented that can be used to simulate binaural detection tasks. In contrast to the most often used cross correlation between the right and left channel, this model is based on contralateral inhibition. The presented model is applied to a wide range of binaural detection experiments. It shows a good fit for changes in masker bandwidth or masker correlation, static and dynamic cues, and level and frequency dependencies.

    The influence of interaural stimulus uncertainty on binaural signal detection

    This paper investigated the influence of stimulus uncertainty in binaural detection experiments and the predictions of several binaural models for such conditions. Masked thresholds of a 500-Hz sinusoid were measured in an NρSπ condition for both running and frozen-noise maskers using a three-interval, forced-choice (3IFC) procedure. The nominal masker correlation varied between 0.64 and 1, and the bandwidth of the masker was either 10, 100, or 1000 Hz. The running-noise thresholds were expected to be higher than the frozen-noise thresholds because of stimulus uncertainty in the running-noise conditions. For an interaural correlation close to +1, no difference between frozen-noise and running-noise thresholds was expected for all values of the masker bandwidth. These expectations were supported by the experimental data: for interaural correlations less than 1.0, substantial differences between frozen and running-noise conditions were observed for bandwidths of 10 and 100 Hz. Two additional conditions were tested to further investigate the influence of stimulus uncertainty. In the first condition a different masker sample was chosen on each trial, but the correlation of the masker was forced to a fixed value. In the second condition one of two independent frozen-noise maskers was randomly chosen on each trial. Results from these experiments emphasized the influence of stimulus uncertainty in binaural detection tasks: if the degree of uncertainty in binaural cues was reduced, thresholds decreased towards thresholds in the conditions without any stimulus uncertainty. In the analysis of the data, stimulus uncertainty was expressed in terms of three theories of binaural processing: the interaural correlation, the EC theory, and a model based on the processing of interaural intensity differences (IIDs) and interaural time differences (ITDs). This analysis revealed that none of the theories tested could quantitatively account for the observed thresholds.
In addition, it was found that, in conditions with stimulus uncertainty, predictions based on correlation differ from those based on the EC theory.
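
The difference between a nominal and a forced masker correlation can be sketched as follows. Mixing two independent noise samples gives a pair whose sample correlation scatters around the nominal value from draw to draw; orthogonalizing and rescaling the second noise first pins the sample correlation to that value. The abstract does not say how the correlation was forced in the experiment, so the Gram-Schmidt step and the helper names `correlated_pair` and `sample_correlation` are assumptions for illustration only.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sample_correlation(left, right):
    """Normalized correlation at lag zero (no mean subtraction)."""
    return dot(left, right) / math.sqrt(dot(left, left) * dot(right, right))

def correlated_pair(rho, n, rng, force=False):
    """Return (left, right) noise with nominal interaural correlation rho."""
    n1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    n2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if force:
        # Remove the n1 component from n2, then match the two energies,
        # so the mixture below has a sample correlation of exactly rho.
        proj = dot(n1, n2) / dot(n1, n1)
        n2 = [b - proj * a for a, b in zip(n1, n2)]
        scale = math.sqrt(dot(n1, n1) / dot(n2, n2))
        n2 = [scale * b for b in n2]
    right = [rho * a + math.sqrt(1.0 - rho * rho) * b for a, b in zip(n1, n2)]
    return n1, right

rng = random.Random(42)
free = sample_correlation(*correlated_pair(0.64, 200, rng))
forced = sample_correlation(*correlated_pair(0.64, 200, rng, force=True))
print(free, forced)  # free varies from draw to draw; forced stays at 0.64
```

The draw-to-draw scatter of the unforced value is one simple way to picture the stimulus uncertainty that running-noise maskers introduce.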

    Spectral and spatial parameter resolution requirements for parametric, filter-bank-based HRTF processing

    The audibility of HRTF information reduction was investigated using a parametric analysis and synthesis approach. Nonindividualized HRTFs were characterized by magnitude and interaural phase properties computed for warped critical bands. The minimum number of parameters was established as a function of the HRTF set under test, the sound-source position, whether overlapping or nonoverlapping parameter bands were used, and whether spectral characteristics were derived from the HRTF magnitude or power spectrum domain. A three-interval, forced-choice procedure was employed to determine the required spectral resolution of the parameters and the minimum requirements for interaural phase reconstruction. The results indicated that, for pink-noise stimuli, the estimation of magnitude and interaural phase spectra per critical band is a sufficient prerequisite for transparent HRTF parameterization. Furthermore, the low-frequency HRTF phase characteristics can be efficiently described by a single interaural delay while disregarding the absolute phase response of the individual HRTFs. Also, high-frequency phase characteristics were found to be irrelevant for the HRTFs and the stimuli used for the test. Estimation of parameters in the spectral magnitude domain using overlapping parameter bands resulted in better quality compared to either power-domain parameter estimation or the use of nonoverlapping parameter bands. When HRTFs were reconstructed by interpolation of parameters in the spatial domain, a spatial measurement resolution of about 10° was shown to be sufficient for high-quality binaural processing. Further reductions in spatial resolution predominantly give rise to monaural cues, which are stronger for interpolation in the vertical direction than in the horizontal direction. The results obtained provide clear design criteria for parametric HRTF processing in filter-bank-based applications such as MPEG Surround.

    A closer look at the representation of interaural differences in a binaural model

    In this contribution, we investigate the internal representations given by the binaural model proposed by Breebaart et al. [1]. We focus on the cues used by an artificial observer when determining just noticeable differences in interaural time differences (ITDs) or interaural level differences (ILDs). The binaural processor consists of an array of excitation-inhibition (EI) cells, each characterized by its internal interaural delay and internal interaural attenuation. This model is conceptually similar to a cross-correlation model and the relevant properties for ITD and ILD discrimination tasks are changes in the binaural pattern along the internal delay and the internal attenuation axis. For ITD discrimination of two coherent signals, the traditional approaches consist of searching for a displacement of the peak of the normalized cross-correlation function or measuring the difference in the cross-correlation function at lag zero. These approaches will be compared to our strategy which maximizes the differences between the two normalized cross-correlation functions. Threshold predictions depend on which position along the internal delay line is analyzed. The characteristics of the binaural pattern with regard to ILD discrimination tasks will also be investigated and compared to those that are relevant for ITD discrimination.
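
The three ITD cues mentioned in this abstract (peak displacement, difference at lag zero, and the largest difference anywhere along the lag axis) can be compared on plain normalized cross-correlation functions. This is only a sketch under simplifying assumptions: it uses broadband noise, integer-sample delays, and no EI cells or peripheral filtering, and the names `normalized_ccf` and `itd_cues` are hypothetical.

```python
import math
import random

def normalized_ccf(left, right, max_lag):
    """Normalized cross-correlation for lags -max_lag..max_lag (in samples).

    A positive lag means the right-ear signal lags the left-ear signal.
    """
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    n = len(left)
    ccf = {}
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for k in range(n):
            j = k + lag
            if 0 <= j < n:
                acc += left[k] * right[j]
        ccf[lag] = acc / norm
    return ccf

def itd_cues(ccf_ref, ccf_test):
    """Three candidate decision variables for telling two stimuli apart."""
    peak = lambda c: max(c, key=c.get)
    return {
        "peak_shift": peak(ccf_test) - peak(ccf_ref),
        "lag0_diff": abs(ccf_test[0] - ccf_ref[0]),
        "max_diff": max(abs(ccf_test[lag] - ccf_ref[lag]) for lag in ccf_ref),
    }

rng = random.Random(1)
left = [rng.gauss(0.0, 1.0) for _ in range(256)]
right_ref = list(left)                # reference interval: ITD of 0 samples
right_test = [0.0] * 3 + left[:-3]    # test interval: right ear delayed by 3 samples

cues = itd_cues(normalized_ccf(left, right_ref, 8),
                normalized_ccf(left, right_test, 8))
print(cues)
```

Because lag zero is one of the lags searched, `max_diff` can never be smaller than `lag0_diff`, which is one way to see why analyzing other positions along the delay axis may change the predicted thresholds.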